feat: Add kube_deployment_spec_topology_spread_constraints metric for issue #2701 #2728
Conversation
- Adds a new metric that counts topology spread constraints in deployment pod templates
- Includes test coverage for both cases (with/without constraints)
- Follows existing patterns and stability guidelines
This issue is currently awaiting triage. If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the appropriate label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: SoumyaRaikwar. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing `/approve` in a comment.
How would you use this metric for alerting or to provide info about the deployment?
These topology spread constraint metrics enable alerting on workload distribution policies. `kube_deployment_spec_topology_spread_constraints > 0` helps detect when spread constraints exist but workloads become unevenly distributed across zones/nodes during resource pressure. You can alert on missing distribution policies with `(kube_deployment_spec_replicas > 1) and (kube_deployment_spec_topology_spread_constraints == 0)` to identify multi-replica deployments lacking a spread configuration.

For dashboards, `count(kube_deployment_spec_topology_spread_constraints > 0)` shows cluster-wide adoption of topology spread policies, complementing the pod affinity/anti-affinity metrics I implemented in PR #2733. During incidents, these metrics help correlate why workloads became concentrated in specific topology domains, or why pods failed to schedule due to overly restrictive spread policies.

This completes the scheduling observability suite from issue #2701: together with my pod affinity/anti-affinity metrics (PR #2733), operators now have full visibility into both co-location/separation rules and even-distribution policies across cluster topology. Thanks @mrueg!
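As an illustration, the missing-distribution-policy alert described above could be written as a Prometheus recording rule file. This is a sketch: the metric name assumes the naming proposed in this PR, and the threshold, duration, and severity label are arbitrary examples.

```yaml
groups:
  - name: topology-spread
    rules:
      # Fire when a deployment runs more than one replica but its pod
      # template defines no topology spread constraints at all.
      - alert: DeploymentMissingTopologySpread
        expr: |
          (kube_deployment_spec_replicas > 1)
          and
          (kube_deployment_spec_topology_spread_constraints == 0)
        for: 15m
        labels:
          severity: info
        annotations:
          summary: >-
            Deployment {{ $labels.namespace }}/{{ $labels.deployment }}
            has multiple replicas but no topology spread constraints.
```

The `and` operator keeps only series present on both sides, so the alert naturally scopes to deployments for which both metrics are exported.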
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Same comment as in the other PR #2733 applies here: we should have explicit metrics per
What this PR does / why we need it
This PR adds the `kube_deployment_spec_topology_spread_constraints` metric, which counts the number of topology spread constraints defined in a deployment's pod template specification. It addresses the topology spread constraints monitoring requirement from issue #2701, which specifically requested visibility into scheduling primitives, including "pod topology spread constraints", for monitoring workload pod distribution.
Which issue(s) this PR fixes
Addresses the topology spread constraints monitoring portion of #2701 ("Add schedule spec and status for workload").
Problem Solved
Issue #2701 identified that operators need to monitor various scheduling primitives to detect when, as the issue puts it, "break variation may happen because pod priority preemption or node pressure eviction", i.e. when the expected pod distribution breaks down due to preemption or node-pressure eviction.